Regression Diagnostics: outliers, leverage, and influence; added variable and marginal model plots
After this lab you will be able to: - assess and diagnose the extent to which outlying observations are driving your results - assess the impact of a given variable within a multiple regression model
This lab uses materials by - Angela Dixon - Andrew Bray - Andrew Bray and Mine Cetinkaya-Rundel - Brian Caffo, Jeff Leek, Roger Peng
To assess whether the linear model is reliable, we need to check for (1) linearity, (2) nearly normal residuals, and (3) constant variability.
Let’s take a look at the usage of “rail trails,” which are trail systems built on old rail grades; we will explore the relationship between temperature and ridership. Load in the RailTrail data set in the mosaicData package:
head(RailTrail)
## hightemp lowtemp avgtemp spring summer fall cloudcover precip volume
## 1 83 50 66.5 0 1 0 7.6 0.00 501
## 2 73 49 61.0 0 1 0 6.3 0.29 419
## 3 74 52 63.0 1 0 0 7.5 0.32 397
## 4 95 61 78.0 0 1 0 2.6 0.00 385
## 5 44 52 48.0 1 0 0 10.0 0.14 200
## 6 69 54 61.5 1 0 0 6.6 0.02 375
## weekday
## 1 1
## 2 1
## 3 1
## 4 0
## 5 1
## 6 1
Let’s start with a scatterplot of daily ridership (volume) and the high temperature that day (hightemp). It certainly appears that the linear model is worth trying. It looks like there might be a bit of a curvilinear relationship at high temperature levels, as ridership appears to go up to about 80 degrees and then decrease again, but we’ll try a linear fit and go from there.
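A sketch of that scatterplot (assumes the mosaicData package is installed, so that RailTrail is available):

```r
library(mosaicData)  # provides the RailTrail data set
plot(volume ~ hightemp, data = RailTrail,
     xlab = "Daily high temperature (F)", ylab = "Ridership volume")
```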
Fit a regression of volume on hightemp:
mod.rider <- lm(volume~hightemp,data=RailTrail)
Take a look at the summary of the model relating the volume of ridership and the high temperature hightemp:
summary(mod.rider)
##
## Call:
## lm(formula = volume ~ hightemp, data = RailTrail)
##
## Residuals:
## Min 1Q Median 3Q Max
## -254.562 -57.800 8.737 57.352 314.035
##
## Coefficients:
## Estimate Std. Error t value Pr(>|t|)
## (Intercept) -17.079 59.395 -0.288 0.774
## hightemp 5.702 0.848 6.724 1.71e-09 ***
## ---
## Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
##
## Residual standard error: 104.2 on 88 degrees of freedom
## Multiple R-squared: 0.3394, Adjusted R-squared: 0.3319
## F-statistic: 45.21 on 1 and 88 DF, p-value: 1.705e-09
Each 1 degree increase in daily high temperature is predicted to increase ridership volume by 5.7 people, holding all other variables constant (though obviously there aren’t other variables in this model).
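To see that interpretation concretely, compare predictions one degree apart; their difference is exactly the hightemp coefficient. (A quick sketch; the model is refit here so the snippet stands alone, assuming mosaicData is installed.)

```r
library(mosaicData)
mod.rider <- lm(volume ~ hightemp, data = RailTrail)
# Predictions one degree apart differ by exactly the hightemp slope (~5.7 riders):
diff(predict(mod.rider, newdata = data.frame(hightemp = c(70, 71))))
```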
Test whether the conditions for regression appear reasonable:
Linearity: You already checked if the relationship between ridership and temperature is linear using a scatterplot. We should also verify this condition with a plot of the residuals vs. temperature.
plot(rstandard(mod.rider) ~ RailTrail$hightemp)
abline(h = 0, lty = 3)
abline(h= 2,lty=2)
abline(h=-2,lty=2)
As we might have guessed from the basic scatterplot, ridership appears to be somewhat linear in high temperature, but the linear model does not perform well at very high or very low temperatures. Namely, the line consistently overpredicts ridership at high and low temperatures, whereas in an ideal-fitting model the residuals would be distributed around zero at all levels of temperature.
Nearly normal residuals: To check this condition, we can look at a histogram
hist(mod.rider$residuals,breaks=10)
or a normal probability plot of the residuals.
qqnorm(mod.rider$residuals)
qqline(mod.rider$residuals) # adds diagonal line to the normal prob plot
mod.rider$fitted
## 1 2 3 4 5 6 7 8
## 456.1766 399.1578 404.8597 524.5991 233.8034 376.3503 359.2447 359.2447
## 9 10 11 12 13 14 15 16
## 439.0710 433.3691 427.6672 353.5428 216.6977 319.3315 268.0146 290.8221
## 17 18 19 20 21 22 23 24
## 536.0029 410.5616 342.1390 222.3996 382.0522 307.9278 387.7541 245.2071
## 25 26 27 28 29 30 31 32
## 330.7353 444.7729 347.8409 307.9278 336.4372 456.1766 416.2635 404.8597
## 33 34 35 36 37 38 39 40
## 319.3315 421.9653 279.4184 364.9466 273.7165 216.6977 496.0898 382.0522
## 41 42 43 44 45 46 47 48
## 444.7729 393.4559 382.0522 376.3503 319.3315 325.0334 393.4559 382.0522
## 49 50 51 52 53 54 55 56
## 416.2635 302.2259 359.2447 461.8785 313.6297 302.2259 433.3691 364.9466
## 57 58 59 60 61 62 63 64
## 410.5616 353.5428 382.0522 347.8409 273.7165 421.9653 325.0334 404.8597
## 65 66 67 68 69 70 71 72
## 461.8785 330.7353 262.3127 387.7541 382.0522 456.1766 496.0898 507.4935
## 73 74 75 76 77 78 79 80
## 484.6860 336.4372 473.2822 364.9466 319.3315 490.3879 342.1390 319.3315
## 81 82 83 84 85 86 87 88
## 404.8597 439.0710 296.5240 336.4372 404.8597 524.5991 353.5428 296.5240
## 89 90
## 490.3879 439.0710
The histogram of residuals actually looks pretty good. There is a slight right skew, but overall the distribution of residuals does not suggest a major problem. The q-q plot illuminates the issue a little better, since it’s clear that the residuals at either tail do not quite fit the linear model. Recall, though, that q-q plot tails don’t look perfect even for simulated data from a known distribution, so this is probably not evidence of a major problem. This is a good example of why you need to check for multiple modeling issues, since some diagnostics might look just fine even when others do not.
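You can see the point about q-q plot tails by simulating draws from an actual normal distribution, each sample the same size as our data (n = 90); even here the tails look ragged:

```r
# Three samples from a known normal distribution; the tails of the q-q
# plots still wander off the reference line.
set.seed(42)
par(mfrow = c(1, 3))
for (i in 1:3) {
  z <- rnorm(90)
  qqnorm(z)
  qqline(z)
}
```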
Constant variability: There are formal tests for heteroskedasticity, but plots serve as a rough heuristic, at least during preliminary fitting. Constant variability means that the variability of points around the regression line remains consistent for all values of X. To check this, we again use the plot of residuals ~ independent variable, and also the residuals ~ fitted values plot:
plot(mod.rider$residuals ~ RailTrail$hightemp)
abline(h = 0, lty = 3) # adds a horizontal dashed line at y = 0
plot(mod.rider$residuals ~ mod.rider$fitted.values)
abline(h = 0, lty = 3) # adds a horizontal dashed line at y = 0
(they should look just about the same - why?)
It’s not entirely clear, but it doesn’t look like constant variability is met, since we observe fairly large variability at medium-range temperatures and lower variability at high and low temps. Let’s run a quick Breusch-Pagan test:
library(lmtest)
## Loading required package: zoo
##
## Attaching package: 'zoo'
##
## The following objects are masked from 'package:base':
##
## as.Date, as.Date.numeric
bptest(mod.rider)
##
## studentized Breusch-Pagan test
##
## data: mod.rider
## BP = 4.7222, df = 1, p-value = 0.02977
Hmm, it looks like we have a small heteroskedasticity problem, since \(p < 0.05\). Let’s move on for now, but strategies you might use include transforming a variable (e.g., log), weighted regression, and in this case adding another variable might fix the problem (can you guess what?).
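As a sketch of the weighted-regression idea, here is a simulated example where the error variance grows with x; the weights 1/x are treated as known purely for illustration (in practice you would estimate them):

```r
# Simulated heteroskedastic data: sd of the errors increases with x.
set.seed(1)
x <- runif(100, 1, 10)
y <- 2 + 3 * x + rnorm(100, sd = sqrt(x))
fit.ols <- lm(y ~ x)
fit.wls <- lm(y ~ x, weights = 1 / x)  # downweight the high-variance points
coef(fit.ols)
coef(fit.wls)
```

Both estimates target the same slope (3 here), but the weighted fit gives less influence to the noisy high-x observations.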
One other thing:
par(mfrow=c(2,2))
plot(mod.rider)
is a shortcut to look at model diagnostics very quickly. How do these results compare with what we did above?
Outliers are points that don’t fit the trend in the rest of the data.
High leverage points have the potential to have an unusually large influence on the fitted model.
Influential points are high leverage points that cause a very different line to be fit than would be with that point removed.
Recall that there are numerous ways of assessing the influence that a given observation has on your model…
Run ?influence.measures to see the full suite of influence measures in stats. The measures include:
- rstandard - standardized residuals: residuals divided by their standard deviations
- rstudent - studentized residuals: residuals divided by their standard deviations, where the ith data point was deleted in the calculation of the standard deviation, so that the residual follows a t distribution
- hatvalues - measures of leverage
- dffits - change in the predicted response when the \(i^{th}\) point is deleted in fitting the model
- dfbetas - change in individual coefficients when the \(i^{th}\) point is deleted in fitting the model
- cooks.distance - overall change in the coefficients when the \(i^{th}\) point is deleted
- resid - returns the ordinary residuals
- resid(fit) / (1 - hatvalues(fit)), where fit is the linear model fit, returns the PRESS residuals, i.e. the leave-one-out cross-validation residuals: the difference between the response and the predicted response at data point \(i\), where that point was not included in the model fitting

One of the more common tools is standardized residuals. Standardized residuals are essentially “z-score residuals”, so if you are using a \(95\%\) confidence level, you can take a quick look to see if any standardized residuals are greater than 2 (or 1.96) or less than -2 (-1.96):
par(mfrow=c(1,1))
plot(rstandard(mod.rider)~ RailTrail$hightemp)
abline(h = 2, lty = 3,col='red') # adds a horizontal dashed line at y = 2
abline(h = -2, lty = 3,col='red') # adds a horizontal dashed line at y = 2
This looks pretty good. We have 90 observations, so we would expect 4 or so to have standardized residuals beyond ±2 just by chance.
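As an aside, the PRESS-residual identity listed above is easy to verify on simulated data: the shortcut formula reproduces the leave-one-out residuals exactly, without refitting the model n times.

```r
# Check that resid(fit) / (1 - hatvalues(fit)) equals the leave-one-out
# (PRESS) residuals computed the slow way.
set.seed(2)
d <- data.frame(x = rnorm(30))
d$y <- 1 + 2 * d$x + rnorm(30)
fit <- lm(y ~ x, data = d)
press <- resid(fit) / (1 - hatvalues(fit))
loo <- sapply(seq_len(nrow(d)), function(i) {
  fit.i <- lm(y ~ x, data = d[-i, ])           # refit without point i
  d$y[i] - predict(fit.i, newdata = d[i, , drop = FALSE])
})
all.equal(unname(press), unname(loo))  # TRUE
```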
Studentized residuals are just like standardized residuals, but we leave out a given point when computing the standard deviation. That way, the very point you are trying to analyze does not factor into its own standardization. In most cases you won’t notice much difference between the two.
plot(rstudent(mod.rider)~ RailTrail$hightemp)
abline(h = 2, lty = 3,col='red') # adds a horizontal dashed line at y = 2
abline(h = -2, lty = 3,col='red') # adds a horizontal dashed line at y = 2
As you might expect, there is not much difference.
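You can quantify how close the two versions are on simulated data with no gross outliers:

```r
# Standardized vs. studentized residuals on clean simulated data:
# the maximum absolute difference is small.
set.seed(3)
d <- data.frame(x = rnorm(50))
d$y <- d$x + rnorm(50)
fit <- lm(y ~ x, data = d)
max(abs(rstandard(fit) - rstudent(fit)))  # small
```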
We can assess leverage by plotting hatvalues…
hat <- hatvalues(mod.rider)
plot(hat)
dfbetas and Cook’s Distance are both common ways to measure influence:
dfbetas are the difference in \(\beta_k\) when observation \(i\) is left out of the model. We care about how each point influences \(\beta_1\) in this case, not \(\beta_0\), so we’ll only plot the second column of results.
modbetas = dfbetas(mod.rider)
plot(modbetas[,2])
Cook’s Distance is an alternative measure: a summary metric that captures the total change in all model parameters due to a given observation. For this reason, there is no absolute standard for what counts as a “large” or “small” Cook’s distance. The most general criterion is that a point is highly influential when \(D_i>1\). However, the number of observations directly limits the influence any one point can have, independent of the values of that observation, so \(D_i > 4/n\) is also a common criterion.
modcooks = cooks.distance(mod.rider)
plot(modcooks)
abline(h=4/nrow(RailTrail),col='red')
It does appear that there are a few influential points that serve to change the coefficient estimate. In particular, there are several observations that pull the coefficient down quite significantly. My hunch is that these are rainy days where nobody wanted to ride their bike.
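To see how a single influential point shows up against the \(4/n\) cutoff, here is a simulated sketch with one wild, high-leverage observation injected deliberately:

```r
# Simulated data: observation 50 has extreme x AND a response far from the trend.
set.seed(4)
d <- data.frame(x = c(rnorm(49), 10))
d$y <- 2 * d$x + rnorm(50)
d$y[50] <- -20                 # true trend would predict about 20
fit <- lm(y ~ x, data = d)
cd <- cooks.distance(fit)
which(cd > 4 / nrow(d))        # observation 50 is flagged (possibly among others)
```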
It’s tough to know how useful a bubble plot is, but it’s fun to make!
plot(hatvalues(mod.rider), rstudent(mod.rider), type='n')
cook <- sqrt(cooks.distance(mod.rider))
points(hatvalues(mod.rider), rstudent(mod.rider), cex=10*cook/max(cook))
abline(h=c(-2,0,2), lty=2)
abline(v=c(2,3) * 2/90, lty=2) # 2p/n and 3p/n leverage cutoffs (p = 2, n = 90)
Robust regression downweights influential data:
library(MASS)
mod.rider.robust = rlm(volume~hightemp,data=RailTrail)
summary(mod.rider.robust)
##
## Call: rlm(formula = volume ~ hightemp, data = RailTrail)
## Residuals:
## Min 1Q Median 3Q Max
## -253.802 -55.176 8.995 58.812 314.594
##
## Coefficients:
## Value Std. Error t value
## (Intercept) -25.3436 56.8986 -0.4454
## hightemp 5.8019 0.8124 7.1421
##
## Residual standard error: 86.29 on 88 degrees of freedom
You could also delete problem points, but I would strongly recommend avoiding this if at all possible, unless you know with great confidence that a data point is an error and not simply an outlying observation.
library(MASS)
mod.rider.delete = lm(volume~hightemp,data=RailTrail[cooks.distance(mod.rider)<=4/nrow(RailTrail),])
summary(mod.rider.delete)
##
## Call:
## lm(formula = volume ~ hightemp, data = RailTrail[cooks.distance(mod.rider) <=
## 4/nrow(RailTrail), ])
##
## Residuals:
## Min 1Q Median 3Q Max
## -262.83 -57.10 3.22 54.76 268.79
##
## Coefficients:
## Estimate Std. Error t value Pr(>|t|)
## (Intercept) -69.656 57.118 -1.220 0.226
## hightemp 6.513 0.828 7.866 1.12e-11 ***
## ---
## Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
##
## Residual standard error: 94.64 on 84 degrees of freedom
## Multiple R-squared: 0.4242, Adjusted R-squared: 0.4173
## F-statistic: 61.87 on 1 and 84 DF, p-value: 1.124e-11
library(texreg)
## Version: 1.35
## Date: 2015-04-25
## Author: Philip Leifeld (University of Konstanz)
##
## Please cite the JSS article in your publications -- see citation("texreg").
screenreg(list(mod.rider,mod.rider.robust,mod.rider.delete))
##
## ============================================
## Model 1 Model 2 Model 3
## --------------------------------------------
## (Intercept) -17.08 -25.34 -69.66
## (59.40) (56.90) (57.12)
## hightemp 5.70 *** 5.80 6.51 ***
## (0.85) (0.81) (0.83)
## --------------------------------------------
## R^2 0.34 0.42
## Adj. R^2 0.33 0.42
## Num. obs. 90 90 86
## RMSE 104.19 94.64
## ============================================
## *** p < 0.001, ** p < 0.01, * p < 0.05
Fairly similar results, but it looks like we have alleviated the issue where those few (potentially) rainy days pull the coefficient downwards. The deletion-procedure model has a MUCH higher coefficient, but again, we probably don’t want this - those data still matter (unless you think it will never rain again or something).
Residual plots are somewhat problematic for multiple regression, because we have many different input variables. Instead, we will use “added variable plots”.
Recall that in simple linear regression we use residual plots to assess the model conditions. How well does that work in multiple regression?
We fit the model:
\[ y \sim x_1 + x_2 \]
library(alr3) #package that accompanies S. Weisberg, Applied Linear Regression, Third Edition, Wiley, 2005
data(caution)
m1 <- lm(y~x1+x2, data = caution)
plot(m1, 1)
If this were a bivariate model, we could conclude that the mean function looks fairly linear but that the errors appear to have increasing variance. However, these are fake data generated from a model with constant variance!
StanRes1 <- rstandard(m1)
plot(caution$x1,StanRes1, ylab="Standardized Residuals")
plot(caution$x2,StanRes1, ylab="Standardized Residuals")
In MLR, in general, you cannot infer the structure you see in the residuals vs. fitted plot as being the structure that was misspecified.
The only conclusion you can draw is that something is misspecified.
So now what?
Although several types of invalid models can create non-constant variance in the residuals, a valid model will always be structureless.
If you can be sure you have a good mean function, then the residual plot is more informative.
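As an illustration of “structure means something is misspecified”: fitting a line to simulated quadratic data leaves an obvious pattern in the residuals, while the correctly specified fit leaves none.

```r
# A line fit to quadratic data vs. the correct quadratic fit.
set.seed(5)
x <- runif(100, -2, 2)
y <- x^2 + rnorm(100, sd = 0.3)
par(mfrow = c(1, 2))
plot(resid(lm(y ~ x)) ~ x, main = "Misspecified: y ~ x")
abline(h = 0, lty = 3)
plot(resid(lm(y ~ x + I(x^2))) ~ x, main = "Correct: y ~ x + x^2")
abline(h = 0, lty = 3)
```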
The objective of constructing an added variable plot is to assess how much each variable adds to your model.
Consider some data concerning restaurants in NYC, where we’d like to build the model:
\[ Price \sim Food + Decor + Service + East \]
We can assess the isolated effect of each predictor on the response with a series of simple scatterplots…
nyc <- read.csv("http://andrewpbray.github.io/data/nyc.csv")
par(mfrow=c(2,2))
plot(Price ~ Food, data = nyc)
abline(lsfit(nyc$Food,nyc$Price))
plot(Price ~ Decor, data = nyc)
abline(lsfit(nyc$Decor,nyc$Price))
plot(Price ~ Service, data = nyc)
abline(lsfit(nyc$Service,nyc$Price))
plot(Price ~ East, data = nyc)
abline(lsfit(nyc$East,nyc$Price))
This might be more efficient…
pairs(Price ~ Food + Decor + Service + East, data = nyc)
But this does not provide a way to look at a variable net of other variables. Instead, an added variable plot tells you how much a given predictor \(x_i\) can explain the response after the other predictors have been taken into account. An “av-plot” has:
On the y-axis, the residuals from the model predicting the response without \(x_i\).
On the x-axis, the residuals from predicting \(x_i\) using those same predictors.
First, get the residuals from the model
\[ Price \sim Decor + Service + East \]
resY <- lm(Price ~ Decor + Service + East, data = nyc)$res
Second, get the residuals from the model
\[ Food \sim Decor + Service + East \]
resX <- lm(Food ~ Decor + Service + East, data = nyc)$res
Then plot them against each other…
plot(resY ~ resX)
The car package has an avPlots() function that does this for you…
library(car)
m1 <- lm(Price ~ Food + Decor + Service + East, data = nyc)
avPlot(m1,variable = "Food")
avPlots(m1)
Notice that if we fit a line through the AVP, the slope should look familiar…
AVPm1 <- lm(resY ~ resX)
AVPm1$coef
## (Intercept) resX
## 5.07403e-17 1.53812e+00
m1$coef
## (Intercept) Food Decor Service East
## -24.023799670 1.538119941 1.910087113 -0.002727483 2.068050156
AVPs can be used to assess whether it makes sense to include an additional variable in the model (similar to looking at the p-value of the predictor).
They’re a bit more informative, though, since they would also indicate if the relationship between that predictor and the response is linear in the context of the other variables.
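The slope identity we just saw with the nyc data holds for any multiple regression (it is the Frisch-Waugh-Lovell result); a simulated check:

```r
# The AVP slope equals the corresponding multiple-regression coefficient.
set.seed(6)
d <- data.frame(x1 = rnorm(80), x2 = rnorm(80))
d$y <- 1 + 2 * d$x1 - d$x2 + rnorm(80)
full <- lm(y ~ x1 + x2, data = d)
resY <- resid(lm(y ~ x2, data = d))   # response, net of x2
resX <- resid(lm(x1 ~ x2, data = d))  # x1, net of x2
all.equal(unname(coef(lm(resY ~ resX))[2]), unname(coef(full)["x1"]))  # TRUE
```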
Let’s look at home prices in LA…
LA <- read.csv("http://andrewpbray.github.io/data/LA.csv")
plot(price ~ sqft, data = LA, col = "steelblue")
In the LA data set, this scatterplot suggests two influential points - but are they influential in a multiple regression model?
mod.la = lm(price ~ sqft + bed + city,data=LA)
summary(mod.la)
##
## Call:
## lm(formula = price ~ sqft + bed + city, data = LA)
##
## Residuals:
## Min 1Q Median 3Q Max
## -8563518 -293754 170468 512647 45112021
##
## Coefficients:
## Estimate Std. Error t value Pr(>|t|)
## (Intercept) -1.284e+06 2.016e+05 -6.368 2.5e-10 ***
## sqft 1.867e+03 3.795e+01 49.207 < 2e-16 ***
## bed -6.165e+05 5.376e+04 -11.467 < 2e-16 ***
## cityLong Beach 6.662e+05 1.582e+05 4.210 2.7e-05 ***
## citySanta Monica 7.384e+05 1.906e+05 3.874 0.000112 ***
## cityWestwood 5.494e+05 2.338e+05 2.349 0.018924 *
## ---
## Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
##
## Residual standard error: 1775000 on 1588 degrees of freedom
## Multiple R-squared: 0.7536, Adjusted R-squared: 0.7528
## F-statistic: 971.3 on 5 and 1588 DF, p-value: < 2.2e-16
hat <- hatvalues(mod.la)
plot(hat)
Definitely something funny going on here.
levs <- influence(mod.la)$hat
hist(levs,breaks=100)
abline(v = 2 * length(mod.la$coef) / nrow(LA), col = "red")
tail(sort(levs))
## 97 69 130 115 117 168
## 0.06355183 0.06739365 0.07181092 0.07429460 0.20746530 0.21011533
e_hat <- mod.la$res
s <- sqrt(1/(nrow(LA) - length(mod.la$coef)) * sum(e_hat^2))
r <- e_hat/(s * sqrt(1 - levs))
hist(r)
tail(sort(r))
## 132 103 48 130 30 56
## 6.475015 7.223771 7.243349 8.989756 9.052396 9.909576
Yes, very high.
cdist <- (r^2 / length(mod.la$coef)) * (levs/(1 - levs))
tail(sort(cdist))
## 139 103 50 83 56 130
## 0.3183459 0.3271137 0.3829354 0.4231406 1.0360441 1.2504889
plot(mod.la, 5)
Yep, highly influential as well.
modcooks = cooks.distance(mod.la)
tail(sort(modcooks))
## 1254 1260 1291 1289 1294 1292
## 0.05408505 0.05847561 0.08015348 0.09034382 4.16812407 23.95308503
plot(modcooks)
abline(h=1,col='red')
table(modcooks > 1)
##
## FALSE TRUE
## 1592 2
Note that we use 1 here instead of \(4/n\) because the sample is much larger. If we were to use \(4/n\), the heuristic cutoff would be 0.0025094, and basically every observation would qualify.
library(car)
avPlots(mod.la)
library(dplyr)
##
## Attaching package: 'dplyr'
##
## The following object is masked from 'package:MASS':
##
## select
##
## The following objects are masked from 'package:stats':
##
## filter, lag
##
## The following objects are masked from 'package:base':
##
## intersect, setdiff, setequal, union
LA <- mutate(LA, logprice = log(price), logsqft = log(sqft))
m2 <- lm(logprice ~ logsqft + bed + city, data = LA)
summary(m2)
##
## Call:
## lm(formula = logprice ~ logsqft + bed + city, data = LA)
##
## Residuals:
## Min 1Q Median 3Q Max
## -1.26020 -0.24897 -0.01613 0.21804 1.37277
##
## Coefficients:
## Estimate Std. Error t value Pr(>|t|)
## (Intercept) 5.13068 0.21201 24.200 <2e-16 ***
## logsqft 1.20729 0.03036 39.769 <2e-16 ***
## bed -0.03010 0.01284 -2.345 0.0191 *
## cityLong Beach -0.88280 0.03467 -25.464 <2e-16 ***
## citySanta Monica -0.09416 0.04022 -2.341 0.0194 *
## cityWestwood -0.46244 0.04876 -9.484 <2e-16 ***
## ---
## Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
##
## Residual standard error: 0.3704 on 1588 degrees of freedom
## Multiple R-squared: 0.8742, Adjusted R-squared: 0.8738
## F-statistic: 2206 on 5 and 1588 DF, p-value: < 2.2e-16
avPlots(m2)
Overall, this model should look quite a bit better.
par(mfrow=c(2,2))
plot(m2)
defects <- read.table("http://www.stat.tamu.edu/~sheather/book/docs/datasets/defects.txt",
header = TRUE)
head(defects)
## Case Temperature Density Rate Defective
## 1 1 0.97 32.08 177.7 0.2
## 2 2 2.85 21.14 254.1 47.9
## 3 3 2.95 20.65 272.6 50.9
## 4 4 2.84 22.53 273.4 49.7
## 5 5 1.84 27.43 210.8 11.0
## 6 6 2.05 25.42 236.1 15.6
Let’s look at the pairwise comparisons…
pairs(Defective ~ Temperature + Density + Rate, data = defects)
Here’s our basic model…
\[ \widehat{Defective} \sim Temperature + Density + Rate \]
m1 <- lm(Defective ~ Temperature + Density + Rate, data = defects)
summary(m1)$coef
## Estimate Std. Error t value Pr(>|t|)
## (Intercept) 10.3243900 65.9264798 0.1566046 0.87676619
## Temperature 16.0779089 8.2941050 1.9384742 0.06349394
## Density -1.8272927 1.4970684 -1.2205807 0.23319949
## Rate 0.1167322 0.1306268 0.8936312 0.37971687
View a summary of model diagnostics:
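A sketch of the quick diagnostic shortcut from earlier, applied here (assumes m1 is the defects model fit just above):

```r
par(mfrow = c(2, 2))
plot(m1)
```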
View residuals plotted against each covariate…
par(mfrow = c(2, 2))
r <- rstandard(m1)
plot(r ~ defects$Temperature)
plot(r ~ defects$Density)
plot(r ~ defects$Rate)
A plot of the response \(y\) against the fitted values \(\hat{y}\) is used to assess whether your model is doing a good job of modeling the response. If it is, you’ll see points along the identity line. If it’s not, there will be non-linear structure; try to correct it by transforming the response, and assess on a predictor-by-predictor basis using marginal model plots.
The standardized residual plots and the plot of \(y\) on \(\hat{y}\) suggest that something is amiss, but what? We need to be sure that the structure in the data is being mirrored well by the structure in our model. This comparison is made for each predictor using the marginal model plot.
Marginal Model Plots are used to assess the marginal relationship between each predictor and the response. - Each compares the fit from the model with a nonparametric fit to the scatterplot. - If your model is well-specified, these two lines will be close to coincident.
You can build them by hand using loess() or use mmp() in the car package.
Now, load the car package (if you haven’t already) and produce the marginal model plots…
par(mfrow=c(2, 2))
library(car)
mmp(m1, defects$Temperature)
mmp(m1, defects$Density)
mmp(m1, defects$Rate)
Or you can use loess to make these by hand:
plot(Defective ~ Temperature, data = defects)
ord <- order(defects$Temperature)           # pair x with fit via a common ordering
lines(defects$Temperature[ord], m1$fit[ord], lwd = 2)
l1 <- loess(m1$fit ~ defects$Temperature)
lines(defects$Temperature[ord], fitted(l1)[ord], lwd = 2, col = "red", lty = 2)
plot(Defective ~ Temperature, data = defects)
ord <- order(defects$Temperature)           # pair x with fit via a common ordering
lines(defects$Temperature[ord], m1$fit[ord], lwd = 2)
l1 <- loess(m1$fit ~ defects$Temperature)
lines(defects$Temperature[ord], fitted(l1)[ord], lwd = 2, col = "red", lty = 2)
l2 <- loess(Defective ~ Temperature, data = defects)
lines(defects$Temperature[ord], fitted(l2)[ord], lwd = 2, col = "blue", lty = 2)
\[ \widehat{\sqrt{Defective}} \sim Temperature + Density + Rate \]
defects <- transform(defects, sqrtDefective = sqrt(Defective))
m2 <- lm(sqrtDefective ~ Temperature + Density + Rate, data = defects)
summary(m2)$coef
## Estimate Std. Error t value Pr(>|t|)
## (Intercept) 5.59297089 5.26400977 1.062492 0.29778175
## Temperature 1.56516337 0.66225665 2.363379 0.02586654
## Density -0.29166383 0.11953593 -2.439968 0.02181531
## Rate 0.01289857 0.01043012 1.236666 0.22726602
Now produce the marginal model plots for the second model.
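A sketch mirroring the mmp() calls used for m1 (assumes the car package is loaded and m2 and defects are in the session from above):

```r
par(mfrow = c(2, 2))
mmp(m2, defects$Temperature)
mmp(m2, defects$Density)
mmp(m2, defects$Rate)
```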
How do these look?